The Strength of Varying Tie Strength 



Jeroen Bruggeman* 
2012 



Abstract 

"The Strength of Weak Ties" argument [14] says that the most 
valuable information is best collected through bridging ties with other 
social circles than one's own, and that those ties tend to be weak. Aral 
and Van Alstyne [3] added that to access complex information, actors 
need strong ties ("high bandwidth") instead. These insights I general- 
ize by pointing at actors' interest to avoid spending large resources on 
low value information. Weak ties are well-suited for relatively simple 
information at low transmission and tie maintenance costs, whereas 
for complex information, the best outcomes are expected for those ac- 
tors who vary their bandwidths along with the value of information 
accessed. To support my claim I use all patents in the USA (two 
million) over the period 1975-1999. I also show that in rationalized 
fields, such as technology, bandwidth correlates highly with the value 
of information, which provides support for using this proxy if value 
can't be measured directly. Finally, I show that betweenness central- 
ity is a better measure for brokerage than the often used constraint, 
and explain why. 



Introduction 

How do people get valuable information^1 to solve their problems and to 
satisfy their needs? Sometimes they can get it by their own intelligence 
or experience. Most of the time, however, they use their network as a 



*Department of Sociology and Anthropology, University of Amsterdam, Oudezijds 
Achterburgwal 185, 1012 DK Amsterdam, the Netherlands. Email: j.p.bruggeman@uva.nl. 
I thank Sinan Aral and Nina O'Brien for commenting on the first draft. 

^1 Depending on the context, the value of information can be attributed to its relevance, 
accuracy, reliability, novelty, scope, timing, rareness, legitimacy, or a combination of (some 
of) these factors, in order to achieve a (more) beneficial outcome. 






radar and filter, compare what they hear and see (and read, in modern 
societies), and attempt to combine useful information into solutions [S]. To 
make beneficial comparisons and combinations, it turns out that, in general, 
diverse information is key [25]. 

From a network perspective, detailed knowledge about the content of ties 
is usually lacking; it is a challenge to model information diversity in a general 
way, and predict beneficial use of it in a broad range of fields. Would network 
diversity, i.e. a focal node's connections to mutually disconnected nodes, be 
sufficient, or would additional indicators be necessary? A large portion of the 
literature confirms that for ego (a focal actor) to access diverse information, 
(s)he should broker across "structural holes" between alters, or information 
sources in general, in different social groups [HIEZ]- Within groups, people 
are highly clustered; this clustering fosters mutual comprehension and co- 
ordination. But continual interactions render shared information relatively 
homogeneous. Rather, as the most valuable information is heterogeneous, 
clustering constrains access to it [H]. Moreover, within-group ties are 
relatively strong with regard to engagement and time spent, whereas 
between-group ties tend to be weak [21]. Consequently, "average" brokers mostly use 
weak ties to access diverse information across different groups. 

In a recent paper, Aral and Van Alstyne [3] proposed, by contrast, the 
notion that for ego to obtain diverse but complex or rapidly changing infor- 
mation, (s)he must use strong ties, called high bandwidth in those contexts, 
even if this implies a loss of some structural holes being brokered. Although 
strong ties' benefits have been noticed in previous studies [28, 16], Aral and 
Van Alstyne support this trade-off by an impressive data set collected from 
an executive recruiting firm, as it exceptionally contained not only network 
ties but also the content of these ties. Among their findings, the authors 
showed that information diversity is indeed enhanced by network diversity. 
This finding supports earlier studies in which tie content was largely 
unknown. In their case study, the benefit of diverse information was 
efficiency: it helped to shorten project duration. 

In most cases, however, it is unlikely that the information value is the 
same across sources. One wonders, then, if an optimal average bandwidth 
could predict the highest benefits. For sure, actors should avoid spending 
large resources on sources that provide low value information, and will benefit 
more if they focus on the best sources they can get access to. In this research 
note I therefore propose a straightforward generalization of Aral and Van 
Alstyne's thesis that incorporates bandwidth variation: for boundedly rational 
actors in complex or turbulent fields, who have to trade off network diversity 
for bandwidth, the best outcomes are expected for those who vary their 
bandwidths along with the value of information accessed. 






In fields such as science, technology, and knowledge-intensive industries, 
where valuable information is complex, actors have to invest time and 
effort in certain sources, oral or written, and achieve a skillful command of 
the knowledge they will have acquired through them. To successfully broker 
and cross-fertilize complex information, accessing those sources must there- 
fore be accompanied by specialization in those sources. For these actors, 
specialization is not only a process of accumulating knowledge in a given do- 
main, but also of interrelating their knowledge more densely such that they 
may discover shortcuts and workarounds. In a service industry, for example, 
this means that employees have to acquaint themselves with their colleagues' 
skills, knowledge and personalities, their clients' wishes and idiosyncrasies, 
and learn effective solutions to the problems at hand. Clearly, because of 
cognitive limitations, nobody can specialize in a great many alters or sources 
simultaneously; this implies a diversity-bandwidth trade-off. This conclusion 
also holds true for collectives, even though they are able to process more 
information than individuals. 

However, we know from science studies that "standing on the shoulders 
of giants," i.e. using the most valuable sources available, has strong posi- 
tive effects, both on individuals' careers and on the accumulation of public 
knowledge [4]. Furthermore, successful PhD students often have good 
supervisors [20]; "good" philosophers owe part of their success to being connected 
to other good philosophers [11]; and successful innovations build on prior 
successful ideas [S]. In all these examples, information value varies across 
sources, and obtainable benefits vary with them: this corroborates my con- 
jecture stated above. Because none of the examples is based on a research 
design specifically targeted toward testing the conjecture, I will test it here 
using a longitudinal data set of patent citations that I describe in the next 
section. I show and discuss the results in the final two sections, respectively. 

Data and measures 

For the empirical test I use publicly available data on a network in which 
the nodes are "invisible colleges" of inventors (n = 409), at the aggregate 
level, and which incorporate all patents in the USA (two million) over the 
period 1975-1999 (USPTO).^2 The administrative units corresponding to 
these colleges of inventors are technology domains wherein patents are cate- 

^2 The data were harvested in 2001 from http://www.uspto.gov/. I exclude from the 
initial 418 domains nine inactive domains that have no outdegree, because several network 
measures become minus infinity after log transforming them for model (3); they do not 
alter the outcome. 






gorized. These units are both stable and non-overlapping over the period of 
observation, while other units are non-stable, overlapping, or both [17]. Like 
scientific citations, patent citations (as directed ties in the network) show 
which ideas have been (re)combined into a new idea. At the level of 
technology domains, weighted ties indicate information flows at that level. The 
information transmitted in this field is complex, but not rapidly changing. 
The period of observation is partitioned into non-overlapping sub-periods, 
lasting five years each, consistent with other studies using patent data [26]. 
Over the 25 years of observation, the density of the network (m/n^2, with m arcs 
among n nodes, for a network with loops) gradually increased from 0.217 in the 
first period to 0.395 in the last, while the average path length (concatenation 
of ties from node a to node b) shrank from 1.83 to 1.62. 
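
As an illustration of the two descriptive statistics just mentioned, the following is a minimal sketch, not the procedure actually used for the patent data. It assumes a hypothetical citation matrix in which rows cite columns, counts an arc wherever an entry is positive, and averages path lengths over reachable ordered pairs only (an assumption, since the text does not say how unreachable pairs were handled).

```python
from collections import deque

import numpy as np


def density_with_loops(A):
    """Share of the n*n possible arcs (loops included) that are present."""
    A = np.asarray(A)
    n = A.shape[0]
    return np.count_nonzero(A) / n**2


def average_path_length(A):
    """Mean shortest-path length over ordered pairs (a, b), a != b, with b reachable from a."""
    A = np.asarray(A)
    n = A.shape[0]
    total, pairs = 0, 0
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search along outgoing arcs
            u = queue.popleft()
            for v in np.nonzero(A[u])[0]:
                v = int(v)
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(d for node, d in dist.items() if node != source)
        pairs += len(dist) - 1
    return total / pairs if pairs else float("nan")


# Hypothetical 3-domain citation matrix; the diagonal holds self-citations.
A_example = np.array([[4, 2, 0],
                      [0, 0, 7],
                      [0, 0, 0]])
print(density_with_loops(A_example))   # 3 arcs out of 9 possible
print(average_path_length(A_example))  # distances 1, 2 and 1 over three reachable pairs
```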

Actors can self-specialize by re-using knowledge produced earlier. At 
the level of a domain, self-specialization means that a myriad of individ- 
uals within the domain cite each other and sometimes themselves. Self- 
specializing individuals build upon earlier experiences, which they integrate 
with their current experiences or with others' information. In a network, 
self-specialization is indicated by a tie from a node to itself (reflexive arc). 
For technology domains, self-specialization ties are the strongest on average, 
and increased from 1508 to 7287 over the period of observation, compared 
to 1086 to 6722 for dyadic ties, respectively, in the range from to 79337. 
Further details of these data have been presented and discussed elsewhere 

To measure nodes' brokerage (network diversity), I use two measures: 
constraint [5], which Aral and Van Alstyne used, and betweenness [1, 13]. 
Constraint depends on two assumptions: first, that actors are less constrained and 
in a better brokerage position if they have higher network diversity (which 
is what we want to measure), and secondly, that actors are less constrained 
if they divide their time and attention equally across their contacts [5], thus 
having a low variation of tie strength. To test my conjecture, however, this 
variation has to be measured separately (see below). Betweenness is more 
parsimonious than constraint and independent from tie strength. Whereas 
betweenness normally takes into account all shortest paths through a focal 
node, recent studies have shown that information further away than ego's di- 
rect connections does not contribute to ego's brokerage opportunities [HE]. 
In line with these findings I truncate shortest paths longer than one tie, thus 
take into account only direct ties between ego and her alters. We may call 
this measure between-two-ness to contrast it with the earlier notion that 
boiled down to between-everybody-ness. 
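
The truncation described above can be read in more than one way, so the following is only a sketch under stated assumptions. It treats node i as brokering an ordered pair (j, l) when arcs j -> i and i -> l exist but no arc j -> l, and, as in ordinary betweenness, it shares the credit equally among all intermediaries of that pair. The function name and the fractional counting are illustrative choices, not taken from the source.

```python
import numpy as np


def between_two_ness(A):
    """Betweenness restricted to paths of exactly two ties (illustrative reading)."""
    B = (np.asarray(A) > 0).astype(float)  # tie strength is ignored on purpose
    np.fill_diagonal(B, 0.0)               # self-ties cannot broker anything
    n = B.shape[0]
    two_paths = B @ B                      # two_paths[j, l] = number of k with j -> k -> l
    scores = np.zeros(n)
    for j in range(n):
        for l in range(n):
            if j == l or B[j, l] > 0 or two_paths[j, l] == 0:
                continue                   # no structural hole to bridge for this pair
            intermediaries = np.nonzero(B[j, :] * B[:, l])[0]
            scores[intermediaries] += 1.0 / two_paths[j, l]
    return scores


# Hypothetical example: node 1 is the only bridge from node 0 to node 2.
A_example = np.zeros((4, 4))
A_example[0, 1] = 5
A_example[1, 2] = 3
print(between_two_ness(A_example))
```

Because the input is binarized first, the score does not depend on tie strength, which is the property the text emphasizes when contrasting this measure with constraint.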

Bandwidth (average tie strength) can be calculated by weighted outdegree 
(number of citations) divided by outdegree (number of cited domains). For its 
variation, I use an entropy measure [12] to facilitate comparison with other studies. 
This measure is based on proportional tie strengths, p_ij, obtained by 
row-normalizing the adjacency matrix, after ensuring that the arcs point in 
the direction of citations (or of information asked, in other studies). The 
measure is normalized for the number of ties, which have already been taken 
into account in between-two-ness. Entropy is highest if all ties are equally 
strong, and low if the focal node has a few strong ties and weak ties with its 
remaining contacts. For node i with (out)degree k_i, normalized tie strength 
entropy is calculated by 

\[ H_i = -\frac{\sum_{j=1}^{k_i} p_{ij} \log p_{ij}}{\log k_i} \tag{1} \]
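
A minimal sketch of how bandwidth and normalized entropy could be computed from a weighted citation matrix follows. It assumes the normalized Shannon entropy used in [12], that is, minus the sum of p_ij log p_ij divided by log k_i, and it excludes self-citations from both measures. The function name is hypothetical, and returning NaN for nodes with fewer than two ties is a choice made here because log(1) = 0 leaves the normalization undefined.

```python
import numpy as np


def bandwidth_and_entropy(A):
    """Average tie strength and normalized tie-strength entropy per row (citing node)."""
    W = np.asarray(A, dtype=float).copy()
    np.fill_diagonal(W, 0.0)               # self-citations are left out of both measures
    n = W.shape[0]
    bandwidth = np.full(n, np.nan)
    entropy = np.full(n, np.nan)
    for i in range(n):
        ties = W[i][W[i] > 0]              # strengths of the outgoing ties of node i
        k = ties.size                      # outdegree
        if k == 0:
            continue
        bandwidth[i] = ties.sum() / k      # weighted outdegree / outdegree
        if k > 1:
            p = ties / ties.sum()          # proportional tie strengths p_ij
            entropy[i] = -(p * np.log(p)).sum() / np.log(k)
    return bandwidth, entropy


# Hypothetical matrix: node 0 spreads its citations evenly, node 1 concentrates them.
A_example = np.array([[5, 2, 2, 0],
                      [9, 0, 1, 0],
                      [0, 0, 0, 3],
                      [0, 0, 0, 0]])
bw, ent = bandwidth_and_entropy(A_example)
print(bw)   # average tie strength per node
print(ent)  # 1.0 for equally strong ties, lower for uneven ties
```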



Information value is arguably difficult to measure in general, but in the 
technology and science fields one can measure citation impact [15]. This 
indicates, crudely, how valuable or relevant others find certain information 
to be to their own inventions or research, respectively. Although this measure 
is incomplete and possibly biased with regard to individual patents, noisy 
data can be informative at the aggregate level of technology domains [18]. I 
include self-citations, because they indicate part of the economic viability of 
a domain. (It turns out that this makes hardly any difference; see footnote 
4.) In short, I sum the columns of the adjacency matrix, A, to create a 
(column) vector of citation impact, y, for each of the five sub-periods. 
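
The column sum can be illustrated with a small sketch. The matrix below is hypothetical, rows cite columns, and the `include_self` switch is an illustrative addition that mirrors the variant without self-citations mentioned in footnote 4.

```python
import numpy as np


def citation_impact(A, include_self=True):
    """Citations received by each node: the column sums of the adjacency matrix."""
    A = np.asarray(A, dtype=float)
    if not include_self:
        A = A - np.diag(np.diag(A))  # zero out the self-citations on the diagonal
    return A.sum(axis=0)


A_example = np.array([[3, 1, 0],
                      [2, 0, 4],
                      [0, 5, 1]])
print(citation_impact(A_example))                      # [5. 6. 5.]
print(citation_impact(A_example, include_self=False))  # [2. 6. 4.]
```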

Because it takes time to patent, and in order to separate cause from effect, the 
predictors are lagged by one five-year period, L = t - 1, where t is an index 
of time periods and L is the lag. For the response variable, benefits obtained 
by using certain information, I use the focal nodes' current citation impact, 
because it indicates the economic value for the patent holders [18]. 

To assess the effect of a node's variation of tie strengths along the value 
of its sources, I construct a measure of network autocorrelation [21, 19], in 
which each weighted outgoing tie is multiplied by the value (citation impact) 
of the node being cited.^3 In order to count but not to square self-citations, 
the diagonal of A should be binarized, resulting in a matrix A', and my 
conjecture can then be formalized as 

\[ y \sim A'_L \, y_L. \tag{2} \]

From entropy and bandwidth, self-citations are excluded, and the parametrized 
model is 

\[ y = X_L \beta + \rho A'_L \, y_L + \epsilon. \tag{3} \]

^3 Usually, the adjacency matrix is row-normalized for autocorrelation models, under the 
assumption that, if a focal actor is influenced by more alters (in this case by citing more 
domains), the effect of each specific alter diminishes in the larger numbers. However, in 
the technology domains, in which many people work, this assumption seems implausible. 



Because of unobserved heterogeneity, e.g. variation in R&D spending across 
domains and time, the model must be expanded to a fixed effect panel model 
with time dummies. 
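
The autocorrelation predictor in model (3) can be sketched as follows. This is an illustration under assumptions rather than the original computation: the lagged matrix is a hypothetical toy example, the citation impact is taken as its column sums, and the `binary` option imitates a fully binarized matrix of the kind used for comparison in the Results while keeping the weighted citation impact (that last detail is an assumption).

```python
import numpy as np


def autocorrelation_term(A_lag, binary=False):
    """Value-weighted tie strength per citing node: A'_L times y_L."""
    A = np.asarray(A_lag, dtype=float)
    y_lag = A.sum(axis=0)                  # lagged citation impact (column sums)
    if binary:
        weights = (A > 0).astype(float)    # only the presence of a tie counts
    else:
        weights = A.copy()
        idx = np.diag_indices_from(weights)
        weights[idx] = weights[idx] > 0    # binarized diagonal: self-citations count once
    return weights @ y_lag


A_example = np.array([[3, 1, 0],
                      [2, 0, 4],
                      [0, 5, 1]])
print(autocorrelation_term(A_example))               # [11. 30. 35.]
print(autocorrelation_term(A_example, binary=True))  # [11. 10. 11.]
```

Binarizing only the diagonal keeps each self-citation from being multiplied by a citation impact that already contains it, which is the "count but not square" idea in the text.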

For all variables except entropy, the logarithm is used to make their dis- 
tributions symmetrical, to straighten their curvilinear relations before com- 
puting the correlations (Table 1), and to use them appropriately in model 
(3). For the same reasons, entropy is raised to the power of 1.5. 
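
One common way to estimate a fixed effect panel model with time dummies is the within estimator, sketched below with NumPy only. The source does not state which estimator or software was used, so the function name, the toy data and the choice of the first period as baseline are all assumptions. In an application, the transformed variables described above would be passed in as `y` and `X`.

```python
import numpy as np


def within_ols(y, X, unit, period):
    """OLS slopes after unit demeaning, with period dummies (first period as baseline)."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    unit = np.asarray(unit)
    period = np.asarray(period)

    # One dummy per period except the first, which serves as the baseline.
    later_periods = np.unique(period)[1:]
    dummies = (period[:, None] == later_periods[None, :]).astype(float)
    Z = np.column_stack([X, dummies, y])

    # Within transformation: subtract each unit's own mean from every column.
    for u in np.unique(unit):
        rows = unit == u
        Z[rows] -= Z[rows].mean(axis=0)

    coef, *_ = np.linalg.lstsq(Z[:, :-1], Z[:, -1], rcond=None)
    return coef[: X.shape[1]]              # slopes of the substantive predictors only


# Hypothetical balanced panel: 6 units observed over 4 periods, true slope 0.5.
rng = np.random.default_rng(0)
unit = np.repeat(np.arange(6), 4)
period = np.tile(np.arange(4), 6)
x = rng.normal(size=unit.size)
unit_effect = rng.normal(size=6)[unit]
period_effect = np.array([0.0, 0.3, 0.6, 0.9])[period]
y = 0.5 * x + unit_effect + period_effect
print(within_ols(y, x, unit, period))      # close to 0.5, since the toy data have no noise
```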





                    cited    auto     band     betw     const    entro
cited                        0.92     0.86     0.85     -0.35    -0.48
autocorrelation                       0.90     0.78     -0.29    -0.56
bandwidth                                      0.65     -0.07    -0.67
between-two-ness                                        -0.58    -0.29
constraint                                                       -0.46
mean                7.93     16.33    2.50     3.93     -2.19    0.651
SD                  1.55     2.06     0.871    1.81     0.498    0.106



Table 1: Correlation matrix; the predictors are lagged by one period. For all 
variables, the logarithm is used, except entropy, which is raised to the power of 1.5. 



Results 

The results are presented in Table 2, in which the overall effect of the five 
time periods is summarized in the adjusted R^2 in the first column. For 
all models and effects in them, P < 0.001. Model Auto 1 shows a strong 
effect of network autocorrelation.^4 To verify if this outcome supports the 
conjecture, I establish two comparisons: first, I examine if for a domain's 
given sources, varying bandwidth across those sources is more beneficial than 
just having ties with them. To this end I binarize the adjacency matrix, 
recompute the model, and, as expected, the result (Auto 2) has a lower 
adjusted R^2. Secondly, I examine the effects of average bandwidth and its 
variation separately, as well as their interaction. In this field with complex 
information, having high bandwidth on average is beneficial (Band 1), a result 

^4 When modeling autocorrelation without self-citations, thus by setting the diagonal of 
A to zero, the significance stays virtually the same as in Auto 1 while the adjusted R^2 
becomes a little lower (0.608 versus 0.611 with self-citations). 






that strongly supports Aral and Van Alstyne's thesis. However, low variation 
of tie strength (high entropy) relates negatively to citation impact (Entro), 
and shows that varying one's bandwidth is also beneficial, not only having 
high bandwidth on average. The conjecture (2) implies that the stronger a 
node's ties are on average, the more beneficial it is to vary their strength. We 
should thus find an interaction effect between bandwidth and its variation, 
and indeed there is such an interaction (Band 2). Taken together, these 
findings strongly support the conjecture. 

We already knew from many earlier studies that brokerage contributes 
to success, although in this case its contribution measured by 
between-two-ness is relatively small.^5 The frequently used measure of constraint is not 
significant, neither on its own (0.133, which is the wrong sign; t = 1.829) 
nor with other variables. It rests on the assumption that a low variation of 
bandwidth is best, whereas here the opposite is true. Consequently, entropy 
nullifies the network diversity portion of constraint. 

Finally, because researchers in many studies don't know the value of in- 
formation that actors access, I am interested in determining how well one can 
predict this parameter by using proxies. Adding between-two-ness to model 
Band 2 raises the adjusted R^2 (Band 3), and shows that we can actually 
explain success nearly as well as when having extensive knowledge on the 
value of information tapped (Auto 1). This is a reassuring thought. 

Discussion 

Aral and Van Alstyne (2011) showed that in fields with complex or rapidly 
changing information, actors benefit from trading off network diversity for 
bandwidth. This case study on technology domains generalized their finding 
by showing, through network autocorrelation, that variation of tie strength 
along the value of sources is more beneficial than just high average 
bandwidth.^6 Computer simulations [22] have shown that in dense networks, such 
as technology domains, the effect of autocorrelation is systematically under- 
estimated. Therefore, the effect of varying tie strength is even stronger than 

^5 Regressing citation impact on between-two-ness on its own is highly significant, 
although its explanatory power is lower than that of autocorrelation and bandwidth (0.331; 
P < 0.001; adj. R^2 = 0.559). Adding between-two-ness to model Auto 1 raises the 
adjusted R^2 a little, to 0.613. These numbers hardly change with a more strict reading of 
brokerage, wherein a structural hole between sources j and l is counted only when focal 
actor i cites both j and l, not (also) when i is sitting on a path from j to l or from l to j. 

^6 In Aral and Van Alstyne's study, the variance of bandwidth is smaller than the mean 
(Table 3, p. 117). Accordingly, if the value of information from various sources is fairly 
similar, their finding is a special case in my general conjecture. 






             Time     Auto 1   Auto 2   Band 1   Entro    Band 2   Band 3
autocor               0.512
                      (0.018)
aut-binary                     0.700
                               (0.033)
bandwidth                               0.831
                                        (0.037)
entropy                                          -2.432
                                                 (0.207)
band · ent                                                -0.624   -0.689
                                                          (0.138)  (0.130)
between                                                            0.232
                                                                   (0.018)
adj. R^2     0.573    0.611    0.579    0.584    0.540    0.585    0.605
F            1046     1369     1049     1096     795      745      748

Table 2: Fixed effect models over four 5-year periods with a one-period lag; 
n=1630. For the models and all effects in them, P < 0.001. 



estimated here. Although this effect was assessed in this study at the ag- 
gregate level of domains, it is very likely to be true for organizations and 
individuals as well. 

Collecting data on the value of information is not always feasible. This 
study suggests that in complex fields that are rationalized in a Weberian 
sense, and when actors are able, with some margin of error, to select 
the information that is most valuable to them, the interaction effect of tie 
strength and its variation is a good proxy. This proxy might also be 
applied to fields such as professional sports, architecture, haute cuisine, 
service industries, and others in which extensive schooling or training is required. 
However, if people are less concerned with the value of information when 
establishing their ties, e.g. when the main content is emotional exchange or 
support, bandwidth will not correspond to information value. Yet, entropy 
(eq. 1) would still be a valid indicator of bandwidth variation. 

The positive effect of brokerage is no surprise and is a well-known 
phenomenon. It is a surprise, however, that the often used measure of constraint 
cannot, in this case, detect this effect. Between-two-ness is more effective, 
because it only takes the diversity of ties into account without an underlying 
assumption of tie strength variation. Constraint rests on the assumption that 
low variation (high entropy) is beneficial, whereas in fields with complex in- 
formation, the opposite turns out to be true. In earlier studies of such fields 






where constraint has been used, the effect of brokerage was underestimated. 

In general, simple information is also transmitted in complex 
fields. This is yet another reason, aside from varying information value, for 
actors to vary their tie strength. Low variation of tie strength is advantageous in 
fields with generically simple information. This was shown in another study, 
on British telephone data, which were stripped of content for the sake of 
privacy [12]. The nodes in that network were postal code-delimited commu- 
nities, as sources and recipients of information. Tie strength was measured by 
time spent (volume of calls) by one node calling another node over a period of 
one month. There, more equally divided attention across sources, indicated 
by high entropy, correlated positively with a community-aggregated index of 
economic welfare (r = 0.73). In that study, it is clear that in the large num- 
bers of calls, exchanges of sophisticated information were by far outnumbered 
by more mundane exchanges. For all those simple subject matters, even valu- 
able ones, strong ties imply redundancy rather than progressive knowledge 
refinement. Dedicating a great deal of attention to relatively few sources has 
therefore no advantages, or only briefly, while it precludes people and their 
communities from getting more valuable information elsewhere. 

Part of the potential benefits depends on how well actors can integrate new 
information with their prior knowledge. Actors' absorptive capacity for new 
information is enhanced by specialization [10]. These actors are then better 
able to notice valuable information amidst noise, including brokerage oppor- 
tunities that laymen overlook. "Chance favors the prepared mind," as Louis 
Pasteur said. 

Over a longer period of time, actors might exploit their sources to a point 
where their potential for novelty runs dry. They can then dis-intensify old 
ties and establish new ones, accumulate specializations over time, and alter- 
nate or combine specializing with brokering [D]. Both cross-sectionally and 
dynamically, it is beneficial for them to vary the strength of their ties. Seen 
from a historical perspective, it is clear that the potential benefits of 
combining information have increased considerably since industrialization [29, 23], 
if irregularly and unequally, and the importance of varying tie strength has 
increased accordingly. 

References 

[1] Jac M. Anthonisse. The rush in a directed graph. Technical report, 
Mathematisch Centrum, Amsterdam, 1971. 






[2] Sinan Aral and Marshall W. Van Alstyne. Network structure and infor- 
mation advantage. Proceedings of the Academy of Management Confer- 
ence, 2007. 

[3] Sinan Aral and Marshall W. Van Alstyne. The diversity-bandwidth 
tradeoff. American Journal of Sociology, 117:90-171, 2011. 

[4] Lutz Bornmann, Félix de Moya Anegón, and Loet Leydesdorff. Do 
scientific advancements lean on the shoulders of giants? A bibliometric 
investigation of the Ortega hypothesis. PLoS ONE, 5:e13327, 2010. 

[5] Ronald S. Burt. Structural Holes: The Social Structure of Competition. 
Harvard University Press, Cambridge, MA, 1992. 

[6] Ronald S. Burt. Structural holes and good ideas. American Journal of 
Sociology, 110:349-399, 2004. 

[7] Ronald S. Burt. Neighbor Networks: Competitive Advantage Local and 
Personal. Oxford University Press, Oxford, 2010. 

[8] Gianluca Carnabuci. The ecology of technological progress: How sym- 
biosis and competition affect the growth of technology domains. Social 
Forces, 88:2163-2187, 2010. 

[9] Gianluca Carnabuci and Jeroen Bruggeman. Knowledge specialization, 
knowledge brokerage, and the uneven growth of technology domains. 
Social Forces, 88:607-641, 2009. 

[10] Wesley M. Cohen and Daniel A. Levinthal. Absorptive capacity: A 
new perspective on learning and innovation. Administrative Science 
Quarterly, 35:128-152, 1990. 

[11] Randall Collins. The sociology of philosophies: A precis. Philosophy of 
the Social Sciences, 30:157-201, 2000. 

[12] Nathan Eagle, Michael Macy, and Rob Claxton. Network diversity and 
economic development. Science, 328:1029-1031, 2010. 

[13] Linton C. Freeman. A set of measures of centrality based on between- 
ness. Sociometry, 40:35-41, 1977. 

[14] Mark Granovetter. The strength of weak ties. American Journal of 
Sociology, 78:1360-1380, 1973. 






[15] Zvi Griliches. Patent statistics as economic indicators: A survey. Journal 
of Economic Literature, 28:1661-1707, 1990. 

[16] Morten T. Hansen. The search-transfer problem: The role of weak ties in 
sharing knowledge across organization subunits. Administrative Science 
Quarterly, 44:82-111, 1999. 

[17] Rebecca Henderson, Adam Jaffe, and Manuel Trajtenberg. Patent cita- 
tions and the geography of knowledge spillovers: A reassessment: Com- 
ment. American Economic Review, 95:461-464, 2005. 

[18] Adam B. Jaffe and Manuel Trajtenberg. Patents, Citations, and Innovations: 
A Window on the Knowledge Economy. MIT Press, Cambridge, 
MA, 2002. 

[19] Roger Th.A.J. Leenders. Modeling social influence through network 
autocorrelation: constructing the weight matrix. Social Networks, 24:21-47, 
2002. 

[20] R. Dean Malmgren, Julio M. Ottino, and Luís A. Nunes Amaral. The 
role of mentorship in protégé performance. Nature, 465:622-626, 2010. 

[21] Peter V. Marsden and Noah E. Friedkin. Network studies of social 
influence. Sociological Methods and Research, 22:127-151, 1993. 

[22] Mark S. Mizruchi and Eric J. Neuman. The effect of density on the 
level of bias in the network autocorrelation model. Social Networks, 
30:190-200, 2008. 

[23] Joel Mokyr. The Gifts of Athena: Historical Origins of the Knowledge 
Economy. Princeton University Press, Princeton, NJ, 2002. 

[24] J.-P. Onnela, J. Saramäki, J. Hyvönen, G. Szabó, D. Lazer, K. Kaski, 
J. Kertész, and A.-L. Barabási. Structure and tie strengths in mobile 
communication networks. Proceedings of the National Academy of 
Sciences, 104:7332-7336, 2007. 

[25] Scott E. Page. The Difference: How the Power of Diversity Creates Bet- 
ter Groups, Firms, Schools, and Societies. Princeton University Press, 
Princeton, NJ, 2007. 

[26] Joel M. Podolny, Toby E. Stuart, and Michael T. Hannan. Networks, 
knowledge, and niches: Competition in the worldwide semiconductor 
industry, 1984-1991. American Journal of Sociology, 102:659-689, 1996. 






[27] Katherine Stovel. Brokerage. Annual Review of Sociology, 38:139-158, 
2012. 

[28] Brian Uzzi. Social structure and competition in interfirm networks: The 
paradox of embeddedness. Administrative Science Quarterly, 42:35-67, 
1997. 

[29] Martin L. Weitzman. Hybridizing growth theory. American Economic 
Review, 86:207-212, 1996. 






